Abstract

Since the Space Shuttle Accident in 1986, NASA 
has been trying to incorporate probabilistic risk 
assessment (PRA) in decisions concerning the 
Space Shuttle and other NASA projects. One
major PRA study NASA is currently conducting
aims to establish an overall risk model for
the Space Shuttle System. The model is intended
to provide a tool to predict the Shuttle risk and to 
perform sensitivity analyses and trade studies 
including evaluation of upgrades. Marshall Space 
Flight Center (MSFC) and its prime contractors 
including Pratt and Whitney (P&W) are part of the 
NASA team conducting the PRA study. MSFC 
responsibility involves modeling the External Tank 
(ET), the Solid Rocket Booster (SRB), the
Reusable Solid Rocket Motor (RSRM), and the 
Space Shuttle Main Engine (SSME). A major 
challenge that faced the PRA team is modeling the 
shuttle upgrades. This mainly includes the P&W 
High Pressure Fuel Turbopump (HPFTP) and the 
High Pressure Oxidizer Turbopump (HPOTP). 
The purpose of this paper is to discuss the various 
methods and techniques used for predicting the 
risk of the P&W redesigned HPFTP and HPOTP. 

Introduction 

Redesigned components, subsystems, and 
systems are common in the Space Shuttle 
program as technology increases and as 
unforeseen problems arise. It is desirable to be 
able to accurately quantify the reliability of the 
redesigned component as well as the subsystem 
and system. With the case of the Space Shuttle, it 
can be seen how changing the reliability of a single 
component affects the overall reliability of the 
Space Shuttle. This can be very useful information 
when trying to determine which components to 
redesign. For example, two redesigned 
components both may affect the overall system 
reliability equally, but the cost of redesigning those 
two components may be considerably different. 
Therefore, an accurate redesign reliability system 
can be beneficial when allocating funds for various 
redesign tasks. Cost and reliability trade studies
can be performed to introduce more flexibility into
the design process of new hardware.

The two most commonly used methods of 
obtaining redesign reliability are probabilistic 
structural analysis and similarity analysis. 1 The 
focus of this paper will be on probabilistic structural 
analysis. However, similarity analysis will be 
discussed briefly, and an example of similarity 
analysis will be provided. 

Similarity Method Reliability Predictions 

There are many methods and databases available 
to perform risk predictions using similarity analysis. 
Some of these methods are based on generic data 
such as MIL-STD-217 (used for electronic 
components) and NPRD95 (used for non- 
electronic components) while others are based on 
actual data. 2 Similarity analysis based on actual 
data will be discussed in this paper. 

Similarity Method Requirements 

The following requirements should be taken into 
account to perform effective reliability predictions: 

1. The predictions must be established within the
concept phase of the design. 

2. The most similar component must be used as 
the baseline. 

3. All applicable historical data should be used. 

4. The criticality category for each failure mode 
must be established and used. 

The predictions should be initiated within the
concept phase in coordination with the FMEA
(Failure Mode/Effects Analysis). The goal of the
predictions, as with all reliability tasks, is to 
improve the reliability of the proposed design. The 
predictions will assist the reliability engineer and 
the designer in identifying the reliability concerns 
that have the greatest impact on the reliability of 
the product. Obviously, in an ideal world all the 
reliability concerns should be addressed, but in the
real world, money and schedules must be taken 
into account. Properly performed predictions 
provide the means to address the “big hitters”. 

Another important factor in establishing the 
predictions early in the design process is to use 
them during the trade studies. Design concepts 
may be traded very early within the design 
process. Therefore, to have an impact, the 
predictions, or at least baselines, must be in place 
to quantify the reliability concerns during the 
trades. 

The reliability of the most similar component 
should be used as the baseline. Predictions must 
take into account similarities as well as differences 
between the baseline design and operational 
environment. By choosing a baseline component 
that is most similar to the proposed design, fewer 
variables are present which can impact the 
accuracy of the predictions. 

All applicable data must be used in developing the 
predictions. In the rocket engine area, many 
believe only flight data should be used in 
establishing the reliability of the engines. However, 
ground testing may provide an even better 
indication of the true reliability, and the combination 
of the two provides a more complete picture. 
Obviously, the data have to be properly screened 
to eliminate data that are due to ground firings that 
do not represent the true design or realistic 
environments. Thus very early development 
designs and testing to extreme environments 
should be eliminated. The criticality category of 
each failure should be taken into account as part 
of the baseline assessment and reliability 
predictions. The reliability categories often
confuse anyone not familiar with them. When asked
what the reliability of a product is, the answer
should be: “Which reliability?”

Why Perform the Reliability Predictions? 

Reliability engineers are frequently asked: “Why
should we bother spending the time and money in
performing predictions?”, “What good are they?”,
“How will they make the product better?”, etc. The
following list provides good answers:

1. Allows the prioritizing of the high risk failure 
modes (more “bang for the buck” in the 
design). 


2. Obligates the design engineers to consider 
reliability equally with the other system 
parameters such as weight, cost, and 
performance. 

3. Provides guidance in the selection of design 
concepts through trade studies. 

4. Assists in quantifying the effects of design 
variability. 

5. Provides an early indication of meeting 
reliability goals. 

6. Enhances the effectiveness of the 
development test program. 

7. Provides input to the Life Cycle Cost Model. 

8. Establishes both scheduled and unscheduled 
maintenance requirements. 

9. Provides input to the Logistics Support 
Analysis. 

10. Cuts unscheduled maintenance time to repair 
by allowing the design to accommodate the 
most unreliable components. 

The bottom line is that if timely predictions are 
performed, the impact on the design and 
subsequent operating costs can be enormously 
beneficial to the product. 

Similarity Method Prediction Example 

To show how this method works in developing a 
reliability prediction, consider a fuel turbopump 
example. The task is to estimate the reliability of a 
new turbopump design. The first step is to 
establish the historical data for the reliability 
baseline. This number is derived using the 
binomial method with a 90% statistical confidence. 
This high level of confidence is necessary to 
ensure the baseline has a high level of accuracy. 
Assume the turbopump failure rate is 50 failures 
per 100,000 engine firings. Now the distribution of 
the piece part failures is considered to identify the 
big hitters needing improvement. 
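
One common way to apply the binomial method at a stated confidence is a one-sided upper bound on the per-firing failure probability. The sketch below assumes that interpretation (the text does not say which bound was used) and finds the bound by bisection. The function names and the test history of 2 failures in 500 firings are hypothetical.

```python
import math


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def upper_failure_bound(failures, trials, confidence=0.90):
    """One-sided upper confidence bound on the per-trial failure probability."""
    if failures >= trials:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The CDF falls as p rises; the bound is the p where CDF = 1 - confidence.
        if binomial_cdf(failures, trials, mid) > 1.0 - confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical test history: 2 failures observed in 500 firings.
bound = upper_failure_bound(failures=2, trials=500)
print(f"90% upper bound on failure probability: {bound:.5f}")
```

With zero observed failures the bound reduces to the closed form 1 - (1 - C)^(1/n), which is a convenient check on the bisection.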

Having identified the turbine blade as the piece 
part with the highest failure rate, the causes of 
failures must now be identified in order to identify 
and quantify the potential fixes. The turbine blades
were the number one contributor, accounting for
35% of the failures, or 17.5 failures per 100,000
engine firings. Let’s further assume that the
distribution for the turbine blades shows that the
number one cause is thermal stress, which accounts
for 57% of the total blade failures, or a rate of
about 10 per 100,000 firings.

Next, the new design is evaluated and the baseline 
failure rates are adjusted. These adjustments, in 
order of preference, are established through 
testing, analyses, or expert opinion (Delphi 
approach). Returning to the example, and 
addressing the “heavy hitters" of thermal stresses, 
the proposed design is assumed to have modified 
the operating temperatures, incorporated a hollow 
blade design, and incorporated a material change. 
Using the adjustment for each of these changes, 
the cumulative failure rate adjustment is 5.52 per 
100,000 engine firings. Therefore, the fuel 
turbopump failure rate for the proposed design, if 
no other changes are made, drops from the 
original 50 to 44.48 failures per 100,000 firings. 
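
The arithmetic in this example can be checked in a few lines. All numbers come from the text above; the variable names are illustrative only.

```python
# Worked arithmetic for the fuel turbopump example (rates per 100,000 firings).
baseline_rate = 50.0    # baseline turbopump failure rate
blade_share = 0.35      # turbine blades' share of turbopump failures
thermal_share = 0.57    # thermal stress share of turbine blade failures
adjustment = 5.52       # cumulative reduction credited to the design changes

blade_rate = baseline_rate * blade_share        # 17.5
thermal_rate = blade_rate * thermal_share       # 9.975, quoted as 10 in the text
predicted_rate = baseline_rate - adjustment     # 44.48

# Equivalent per-firing reliability of the proposed design.
predicted_reliability = 1.0 - predicted_rate / 100_000.0

print(f"blade rate:       {blade_rate:.2f}")
print(f"thermal rate:     {thermal_rate:.3f}")
print(f"predicted rate:   {predicted_rate:.2f}")
print(f"reliability/fire: {predicted_reliability:.6f}")
```

Note that the 5.52 adjustment removes a little over half of the thermal stress failures, so thermal stress remains a contributor in the proposed design.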

Actual Similarity Method Prediction Example 

This method has been successfully used on 
previous jet engine programs. In the early 1980’s,
the Air Force wanted a jet fighter engine with
increased reliability and lower maintenance 
requirements to decrease life cycle costs. A new 
engine design and development program was 
initiated, and the reliability predictions were 
established using the previous engine as the 
baseline. These predictions were made 
approximately 54 months prior to operation. 

As with the previous example, the components 
were assessed at the part and failure mode levels. 
When the analysis was complete, the predicted 
failure rate of the new engine was approximately 
25% of the previous engine's failure rate. This 
dramatic increase in reliability was due to a very 
effective use of lessons learned and reliability data. 

When the total operating time of the new engine
reached the point (T0) on which the prediction was
based, a comparison of the actual failure rate
versus the predicted failure rate was made. A 
difference of only 4% existed between the 
predicted value and the actual value. Figure 1 
shows the previous actuals, the point of estimate, 
and the delta between the estimate and the new 
actuals. 
Prediction within 4% of actual.


Figure 1 Similarity predictions can provide a high degree of accuracy 


Probabilistic Structural Reliability Predictions 

Unlike similarity analysis, probabilistic structural 
analysis uses the actual design structural failure 
mode model to calculate reliability predictions. 3 
These predictions are not based on similar 
components and past test experience. They are
based on the structural model and the variation of
the parameters or input variables in the structural 
model. Probabilistic design methodology 
considers statistical distributions of the life- 
controlling variables thus providing a distribution of 
component reliability. Probabilistic design 
methodology must be integrated within an
organization’s current design system. 4 

Monte Carlo simulation (where feasible) is the 
most accurate probabilistic methodology as the 
sample size increases to infinity. Monte Carlo 
simulation entails characterizing each input or life- 
controlling variable with a distribution and then 
simulating from this distribution a large number of 
times. This large number of combinations of 
simulated random variables is then run through the 
life equation or design code and a large number of 
output or life variables is obtained. A distribution 
can then be fit to the output or life variable which 
then can be evaluated at the desired point of 
interest (failure criterion). This evaluation will 
determine the reliability of the component. From 
this reliability number, future failures can be 
predicted over the course of the life of the 
component. A typical Monte Carlo simulation 
flowchart is shown in Figure 2. 
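
The steps above can be sketched in a few lines. This is a minimal illustration, not the design system used in the study: the life equation, the input distributions, the lognormal fit, and the failure criterion are all hypothetical stand-ins.

```python
import math
import random
import statistics


def life_equation(stress, strength):
    """Hypothetical life model (cycles); stands in for a real design code."""
    return 1.0e4 * (strength / stress) ** 3


def monte_carlo_life(n, seed=0):
    """Simulate n lives by sampling each life-controlling variable."""
    rng = random.Random(seed)
    lives = []
    for _ in range(n):
        stress = rng.gauss(100.0, 8.0)      # assumed input distribution (KSI)
        strength = rng.gauss(150.0, 10.0)   # assumed input distribution (KSI)
        lives.append(life_equation(stress, strength))
    return lives


lives = monte_carlo_life(20_000)

# Fit a lognormal to the output by fitting a normal to log(life).
log_lives = [math.log(x) for x in lives]
fit = statistics.NormalDist(statistics.fmean(log_lives), statistics.stdev(log_lives))

required_life = 1.0e4                        # hypothetical failure criterion (cycles)
p_fail = fit.cdf(math.log(required_life))    # P(life < required life)
reliability = 1.0 - p_fail
print(f"P(failure) = {p_fail:.2e}, reliability = {reliability:.6f}")
```

Evaluating a fitted distribution rather than counting simulated failures lets the estimate reach probabilities far smaller than 1/n, at the cost of depending on how well the chosen distribution fits the tail.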

Some design codes are very complex and time-
consuming, thus prohibiting thousands of Monte
Carlo simulations. In these cases, alternative
methods are applied that use fewer design-code
runs to approximate the “true” structural models.
An example of this is a Response Surface Monte
Carlo simulator where a Box-Behnken or Central
Composite designed experiment is used to
systematically make a small number of design
code runs in order to fit a response surface to the
output data points. 5 A Box-Behnken design (see
Figure 3) is a three-level design that is used to fit
response surfaces. Points in the design space are
systematically chosen for each random variable
$(\mu - k\sigma,\ \mu,\ \mu + k\sigma)$, where k is a
constant, such that all main effects, 2-way
interactions, and 2nd-order terms can be estimated
(see Table 1). 6

A Central Composite design (see Figure 4) is also
a 2nd-order design; it is a factorial or fractional
factorial design with the addition of center points
and star points. This design therefore has five
levels $(\mu - k\sigma,\ -\alpha,\ \mu,\ \alpha,\ \mu + k\sigma)$,
where $\alpha$ is the “star” point that is chosen to
allow estimation of the 2nd-order terms (see
Table 2). 7 Unless a response surface is highly
nonlinear, Box-Behnken and Central Composite
designs usually estimate the “true” response
surface very accurately.
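
Both designs are easy to generate in coded units. The sketch below builds the three-variable Box-Behnken runs (every pair of factors at ±1 with the third at 0, plus center points) and a central composite design (cube, star, and center points). The function names are hypothetical, and the star distance 1.682, the usual rotatable choice for three factors, is an assumption; the text does not give a value.

```python
from itertools import combinations, product


def box_behnken(k, center_points=3):
    """Coded Box-Behnken runs: each pair of factors at +/-1, the others at 0.

    This pairwise construction is used here for the three-variable case.
    """
    runs = []
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * k
            point[i], point[j] = a, b
            runs.append(tuple(point))
    runs.extend([(0,) * k] * center_points)
    return runs


def central_composite(k, alpha, center_points=1):
    """Coded central composite runs: cube portion + star portion + center."""
    cube = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    star = []
    for i in range(k):
        for s in (-alpha, alpha):
            point = [0.0] * k
            point[i] = s
            star.append(tuple(point))
    return cube + star + [(0.0,) * k] * center_points


def decode(run, mu, sigma, k_mult):
    """Map coded levels (-1, 0, +1) to physical values mu + k*sigma*level."""
    return tuple(m + k_mult * c * s for c, m, s in zip(run, mu, sigma))


bbd = box_behnken(3)                 # 12 edge runs + 3 center runs
ccd = central_composite(3, 1.682)    # 8 cube + 6 star + 1 center
for number, run in enumerate(bbd, start=1):
    print(number, run)
```

For three variables both designs need about 15 design-code runs, compared with 27 for a full three-level factorial.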
Figure 2 Monte Carlo Simulation Flowchart


Table 1 A Three-Variable Box-Behnken Design

Run    X1    X2    X3
  1    -1    -1     0
  2    -1     1     0
  3     1    -1     0
  4     1     1     0
  5    -1     0    -1
  6    -1     0     1
  7     1     0    -1
  8     1     0     1
  9     0    -1    -1
 10     0    -1     1
 11     0     1    -1
 12     0     1     1
 13     0     0     0
 14     0     0     0
 15     0     0     0



American Institute of Aeronautics and Astronautics 






AIAA-98-1938 



Figure 3 A Three Variable Box-Behnken Design 

Table 2 A Three-Variable Central Composite Design

a.) Cube Portion:          $(\pm 1, \pm 1, \pm 1)$
b.) “Star” Portion:        $(\pm\alpha, 0, 0)$, $(0, \pm\alpha, 0)$, $(0, 0, \pm\alpha)$
c.) Center Points:         $(0, 0, 0)$
d.) Cube + Star + Center:  $(\pm 1, \pm 1, \pm 1)$, $(\pm\alpha, 0, 0)$, $(0, \pm\alpha, 0)$,
                           $(0, 0, \pm\alpha)$, $(0, 0, 0)$


Figure 4 A Central Composite Design for Three Variables 


[Figure 5 flowchart: Box-Behnken matrix of runs for inputs X1 ... Xp; design code/software (e.g., FEM, thermal model) with inputs X1 ... Xp and output Y; outputs Y1 ... Yn; iterate n times.]

Figure 5 Response Surface Monte Carlo Flowchart


The complex design code is therefore 
approximated with a response surface regression 
model and a Monte Carlo simulator can be run 
using this regression model. The large number of 
iterations using this regression model will be much 
faster than using the complex design code. 
Finally, a distribution can be fit to the output or life 
variable and this distribution can be evaluated 
where desired. A flowchart of the Response 
Surface Monte Carlo is shown in Figure 5. 
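
A compact sketch of this procedure follows. The "design code" here is a hypothetical quadratic function standing in for an expensive FEM or thermal model, the least-squares solver is a plain normal-equations implementation, and the coded input spread (k = 2) is an assumption.

```python
import random
import statistics
from itertools import combinations, product


def quad_terms(x):
    """Full quadratic basis: 1, x_i, x_i*x_j (i < j), x_i^2."""
    cross = [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    return [1.0, *x, *cross, *(v * v for v in x)]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def fit_response_surface(runs, outputs):
    """Least-squares fit via the normal equations (X'X) beta = X'y."""
    rows = [quad_terms(r) for r in runs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, outputs)) for i in range(p)]
    return solve(xtx, xty)


def design_code(x):
    """Hypothetical stand-in for an expensive design code (coded inputs)."""
    x1, x2, x3 = x
    return 10.0 + 2.0 * x1 - 1.5 * x2 + 0.5 * x3 + 0.8 * x1 * x2 - 0.6 * x3 * x3


# Three-variable Box-Behnken runs: 12 edge midpoints plus 3 center points.
runs = []
for i, j in combinations(range(3), 2):
    for a, b in product((-1.0, 1.0), repeat=2):
        pt = [0.0, 0.0, 0.0]
        pt[i], pt[j] = a, b
        runs.append(tuple(pt))
runs += [(0.0, 0.0, 0.0)] * 3

# Only 15 design-code runs are needed to fit the surrogate.
beta = fit_response_surface(runs, [design_code(r) for r in runs])


def surrogate(x):
    """Evaluate the fitted response surface at coded point x."""
    return sum(b * t for b, t in zip(beta, quad_terms(x)))


# Monte Carlo on the cheap surrogate. With design levels at mu +/- k*sigma,
# a N(mu, sigma) input has standard deviation 1/k in coded units.
k_mult = 2.0
rng = random.Random(1)
samples = [surrogate([rng.gauss(0.0, 1.0 / k_mult) for _ in range(3)])
           for _ in range(20_000)]
output_fit = statistics.NormalDist.from_samples(samples)
print(f"output mean = {output_fit.mean:.3f}, stdev = {output_fit.stdev:.3f}")
```

Because the stand-in code is itself quadratic, the surrogate reproduces it almost exactly; a real design code would leave residuals that should be examined before trusting the surrogate.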

Several other probabilistic methodologies exist 
when Monte Carlo cannot be used due to time 
and/or cost constraints. These methods all 
attempt to maintain high accuracy while saving 
considerable time in the reliability analysis. 
Response Surface Monte Carlo as previously 
described is heavily dependent on being able to 
find an accurate yet simplified approximation to the 
complex design code. The response surface
model can be checked for goodness of fit by
checking several criteria ($R^2$, residual plots,
p-values, influential points, etc.). If the fitted
response surface model “passes” all of the criteria
for a good fitting model, then it would seem
reasonable to use this approximated model in the
Monte Carlo simulation. 8

In addition to Response Surface Monte Carlo 
simulation, there are many other probabilistic 
numerical methods used for calculating 
component reliability. These methods can be 
grouped into approximately five common groups of 
probabilistic methods. The first common group is 
simulation methods which contains Monte Carlo 
Simulation, Directional Simulation, and Latin 
Hypercube Simulation. These methods are usually 
the most accurate (especially Monte Carlo 
Simulation) but are also often time consuming and 
computer intensive. The second group is 
response surface/designed experiments which 
contains Response Surface Monte Carlo using a 
Box Behnken, Central Composite, or a variety of 
other designed experiments. These methods are 
very accurate if the design code or life equation 
can be accurately estimated with a response 
surface model. The third group of probabilistic 
methods includes First Order Reliability Method 
(FORM), Second Order Reliability Method
(SORM), and Fast Probability Integration (FPI). 
The accuracy of these methods decreases as the 
number of random variables increases. The fourth 
group of probabilistic methods is importance 
sampling which includes the Importance Sampling 
Method using factor and radius (ISAMF & ISAMR), 
Adaptive Importance Sampling 1st Order (AIS1),
and Adaptive Importance Sampling 2nd Order
(AIS2). These methods sample around the failure 
region rather than the entire sample space, thus 
saving sampling and simulation time. Finally, the 
fifth group of probabilistic methods falls into the 
mean-based methods category and includes the 
Advanced Mean Value Method (AMV), and the 
Advanced Mean Value plus iterations (AMV+). 
These methods utilize point expansion and 
perturbation techniques to estimate the failure 
probability. 9 
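
To illustrate why sampling around the failure region saves effort, the sketch below applies a basic, non-adaptive importance sampling scheme to a linear strength-minus-stress margin with normal inputs, for which the exact answer is known. It is not an implementation of ISAMF, ISAMR, or AIS; the distribution parameters and function name are hypothetical.

```python
import math
import random
import statistics


def failure_probability_is(mu_g, sd_g, n, seed=0):
    """Estimate P(g < 0) for g ~ N(mu_g, sd_g) by importance sampling.

    Samples are drawn from N(0, sd_g), centered on the limit state g = 0,
    and each failing sample is weighted by the density ratio f(g)/h(g).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        g = rng.gauss(0.0, sd_g)
        if g < 0.0:
            # Ratio of N(mu_g, sd_g) to N(0, sd_g) densities at g.
            total += math.exp((2.0 * g * mu_g - mu_g * mu_g) / (2.0 * sd_g * sd_g))
    return total / n


# Hypothetical inputs: strength ~ N(150, 10) KSI, stress ~ N(100, 8) KSI.
mu_g = 150.0 - 100.0
sd_g = math.sqrt(10.0**2 + 8.0**2)

estimate = failure_probability_is(mu_g, sd_g, n=20_000)
exact = statistics.NormalDist().cdf(-mu_g / sd_g)   # closed form for this case
print(f"importance sampling: {estimate:.3e}")
print(f"exact:               {exact:.3e}")
```

About half of the samples land in the failure region here, whereas plain Monte Carlo with the same 20,000 samples would be expected to see roughly one failure.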

Probabilistic structural analysis makes use of 
structural models to calculate a failure mode 
distribution. Each input variable (temperature, 
stress, etc.) is considered as a random variable. 
These variables are then simulated from a 
historically-based or assumed distribution and 
input into the structural model. This process is
repeated a large number of times to produce a
failure mode distribution. This distribution can be
evaluated at various points to determine a
probability of failure (e.g., cumulative failure
percent by 60 missions).

Various failure modes such as low cycle fatigue 
(LCF), high cycle fatigue (HCF), fracture life, 
margin of safety (M.O.S.), and thermal mechanical 
fatigue (TMF) can be analyzed probabilistically. As 
long as a structural model can be defined that 
predicts these failure modes and the distributions 
for all the input variables included in the structural 
model can be defined, a probabilistic analysis can 
be performed. Input variable distributions may 
come from historical data (test data) or 
engineering assessment. These distributions may 
be Beta, Exponential, Lognormal, Normal, Weibull, 
or any other distribution that fits the data accurately 
(see Figure 6). The values of the input variables 
(stress, temperature, etc.) are then simulated from 
these distributions a large number of times forming 
many different combinations of the input variables. 
These various combinations are then run through 
the structural model (LCF, HCF, etc.) to obtain a 
distribution of the output variable. 








Figure 6 Possible Shapes for the Beta, Exponential, Lognormal, Normal, and Weibull Distributions 

Probabilistic Structural Analysis Example 
A probabilistic structural analysis example of an
SSME HPOTP 1-2 Turbine Spacer Margin of
Safety (M.O.S.) follows:

\[
\mathrm{M.O.S.} = \left( \frac{\sigma_{UTS}}{\sigma_{104\%\,(nom)}} \right) - 1
\]

where:

$\sigma_{UTS}$ = Ultimate Tensile Strength (KSI)

$\sigma_{104\%\,(nom)}$ = Nominal Operating Stress (KSI) for
a 104% Nominal Mission

$\sigma_{UTS} = f(T)$, where T is Temperature (°F)


(M.O.S.) Ultimate. First the variables temperature
and stress ($\sigma_{104}$) are simulated from normal
distributions. Then temperature is used in a
regression model (with error) to predict Ultimate
Tensile Strength (UTS). The resulting ultimate
tensile strength and stress values are used in the
Margin of Safety equation to calculate a value for
Margin of Safety. This process is repeated a large
number of times and a distribution is fit to the
Margin of Safety values.
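
The simulation just described can be sketched as follows. Every number here (the temperature and stress distributions, the linear UTS regression, and its error term) is a hypothetical placeholder, not HPOTP data, and the normal fit to the M.O.S. values is one possible choice of distribution.

```python
import random
import statistics


def simulate_mos(n, rng, temp_mu=1000.0, temp_sd=25.0, stress_mu=90.0, stress_sd=6.0):
    """Simulate n margin-of-safety values, M.O.S. = UTS / stress - 1."""
    values = []
    for _ in range(n):
        temp = rng.gauss(temp_mu, temp_sd)          # temperature (deg F)
        stress = rng.gauss(stress_mu, stress_sd)    # 104% operating stress (KSI)
        # Hypothetical regression of UTS (KSI) on temperature, with error.
        uts = 200.0 - 0.05 * temp + rng.gauss(0.0, 3.0)
        values.append(uts / stress - 1.0)
    return values


rng = random.Random(42)
mos = simulate_mos(20_000, rng)

# Fit a distribution to the M.O.S. values and evaluate it at the failure
# criterion: a margin of safety below zero constitutes a failure.
fit = statistics.NormalDist.from_samples(mos)
p_fail = fit.cdf(0.0)
print(f"mean M.O.S. = {fit.mean:.3f}, stdev = {fit.stdev:.3f}")
print(f"P(M.O.S. < 0) = {p_fail:.2e}")
```

The failure probability comes from the tail of the fitted distribution, so the result is sensitive to the distribution chosen; a goodness-of-fit check on the simulated values is advisable.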



Figure 7 M.O.S. Failure Mode Distribution Flowchart 


Uncertainty Distributions 
In the Probabilistic Structural Analysis Example,
the input variables were simulated from normal 
distributions or predicted from regression models 
to form a distribution of margin of safety (M.O.S.). 
This distribution was analyzed to determine the 
probability of having a margin of safety less than 
zero (which constitutes a failure). Analyzing the 
resulting failure distribution for margin of safety at 
zero results in a single probability estimate of 
failure. To obtain the uncertainty about this 
estimate, several methods are proposed. 

Hyperparameterization 

Each input variable was simulated from a 
distribution characterized by one or more 
parameters. Placing variation on these
parameters due to uncertainty and running a
Monte Carlo simulation with different parameter
values each time would result in a large number of
different margin of safety failure mode distributions
when finished. Then each failure mode distribution
could be analyzed at zero to obtain the probability
of failure. Therefore, a distribution of failure
probabilities would be obtained. For instance,
instead of saying that the probability of fracturing a
1-2 Turbine Spacer due to negative margin of
safety was $1.0 \times 10^{-9}$, it could now be stated
that the range of failure probabilities was between
$1.0 \times 10^{-11}$ and 1.0 x 10"' with 95% confidence. This
method seems the most realistic since the 
designer/engineer may not be 100% confident of 
the true mean and variation of the minimum and 
maximum temperatures and stresses. By allowing 
these parameters to vary, some of the uncertainty 
that exists in these variables is captured. 


[Figure 8 flowchart: START; if Temp ~ N($\mu$, $\sigma$), then simulate the distribution parameters ($\mu$, $\sigma$); evaluate the failure mode distribution to determine the failure percent; uncertainty distribution; iterate n times.]

Figure 8 M.O.S. Uncertainty Distribution (Hyperparameterization) Flowchart

As shown in Figure 8, first the normal distribution
parameters $(\mu_1, \sigma_1)$ for temperature are simulated
from uniform distributions. Then the normal
distribution parameters $(\mu_2, \sigma_2)$ for $\sigma_{104}$ (stress at
104% operating conditions) are simulated from
uniform distributions. The temperature
parameters are then used to simulate a
temperature value from a normal distribution
$(\mu_1, \sigma_1)$. This value of temperature is used to
predict a value of $\sigma_{UTS}$ (ultimate tensile strength).
The $\sigma_{104}$ parameters are then used to simulate a
$\sigma_{104}$ value from a normal distribution $(\mu_2, \sigma_2)$.
These values are used to calculate a margin of
safety (M.O.S.) value. Temperature and $\sigma_{104}$ values
are repeatedly simulated and a distribution is fit to
the M.O.S. values. This distribution is evaluated at
the failure criterion to determine the probability of
failure. This entire process is repeated (beginning
with simulating new normal distribution parameters
for both temperature and $\sigma_{104}$) until a distribution
can be fit to the probability of failure values, thus
representing the uncertainty distribution for
probability of failure.
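
A minimal sketch of this two-level (nested) simulation follows. The uniform ranges for the distribution parameters, the UTS regression, and the loop sizes are hypothetical, and the uncertainty is summarized with empirical percentiles rather than a fitted distribution.

```python
import random
import statistics


def mos_failure_probability(rng, temp_mu, temp_sd, stress_mu, stress_sd, n=1000):
    """Inner loop: simulate M.O.S., fit a normal, and evaluate P(M.O.S. < 0)."""
    mos = []
    for _ in range(n):
        temp = rng.gauss(temp_mu, temp_sd)
        stress = rng.gauss(stress_mu, stress_sd)
        uts = 200.0 - 0.05 * temp + rng.gauss(0.0, 3.0)   # hypothetical regression
        mos.append(uts / stress - 1.0)
    return statistics.NormalDist.from_samples(mos).cdf(0.0)


rng = random.Random(7)
p_fail_values = []
for _ in range(200):
    # Outer loop: draw the distribution parameters from uniform distributions
    # (assumed ranges) to represent uncertainty in the inputs themselves.
    temp_mu = rng.uniform(975.0, 1025.0)
    temp_sd = rng.uniform(20.0, 30.0)
    stress_mu = rng.uniform(85.0, 95.0)
    stress_sd = rng.uniform(5.0, 7.0)
    p_fail_values.append(
        mos_failure_probability(rng, temp_mu, temp_sd, stress_mu, stress_sd)
    )

# 2.5th and 97.5th percentiles give a 95% interval on the failure probability.
cuts = statistics.quantiles(p_fail_values, n=40)
lower, upper = cuts[0], cuts[-1]
print(f"95% interval on P(failure): {lower:.2e} to {upper:.2e}")
```

The interval typically spans several orders of magnitude because small shifts in the mean or spread of the inputs move the tail probability sharply.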

Bootstrapping 

Bootstrapping would be a final possibility to obtain 
an uncertainty distribution. Bootstrapping, a form 
of resampling with replacement from a data set, is 
frequently used when there is a lack of data. 10 The 
resampling produces many “pseudo” data sets 
from the original data set. These new “pseudo” 
data sets can then be analyzed to obtain a 
distribution of the parameters of interest. 





Figure 9 M.O.S. Uncertainty Distribution (Bootstrapping) Flowchart 


Resampling with replacement from the failure 
mode distribution would produce many different 
failure mode distributions which could be analyzed 
at the desired point of interest. This method also 
captures the uncertainty about the original failure 
mode distribution. As shown in Figure 9, first,
temperature and $\sigma_{104}$ are simulated from normal
distributions $(\mu, \sigma)$. Then temperature is used to
predict $\sigma_{UTS}$ based on a material regression model.
Margin of Safety (M.O.S.) is calculated using $\sigma_{104}$
and $\sigma_{UTS}$. This process is repeated a large number
of times and a distribution is fit to the Margin of
Safety (M.O.S.) values.

This distribution is then evaluated at the failure 
criterion to determine a failure probability. Then a 
new sample with replacement is taken from the 
original simulated sample of M.O.S. values and a 
distribution is fit to this sample and evaluated at 
the failure criterion. This process of resampling
from the original M.O.S. sample is repeated a
large number of times. Finally, a distribution is
fit to the failure probabilities, thus forming the 
uncertainty distribution for the failure probabilities. 
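
The bootstrap procedure can be sketched as follows. The M.O.S. sample is generated from the same kind of hypothetical model as before (assumed input distributions and UTS regression), and the spread of the resampled failure probabilities is summarized with empirical percentiles rather than a fitted distribution.

```python
import random
import statistics


def failure_probability(sample):
    """Fit a normal to M.O.S. values and evaluate P(M.O.S. < 0)."""
    return statistics.NormalDist.from_samples(sample).cdf(0.0)


rng = random.Random(11)

# Original simulated M.O.S. sample (all parameters are hypothetical).
mos = []
for _ in range(2000):
    temp = rng.gauss(1000.0, 25.0)                    # temperature (deg F)
    stress = rng.gauss(90.0, 6.0)                     # 104% operating stress (KSI)
    uts = 200.0 - 0.05 * temp + rng.gauss(0.0, 3.0)   # UTS regression with error
    mos.append(uts / stress - 1.0)

estimate = failure_probability(mos)

# Resample with replacement from the original sample, refit, and re-evaluate.
boot = []
for _ in range(500):
    resample = rng.choices(mos, k=len(mos))
    boot.append(failure_probability(resample))

cuts = statistics.quantiles(boot, n=40)   # 2.5% ... 97.5% cut points
lower, upper = cuts[0], cuts[-1]
print(f"point estimate: {estimate:.2e}")
print(f"95% bootstrap interval: {lower:.2e} to {upper:.2e}")
```

Unlike hyperparameterization, this approach needs no assumptions about the uncertainty in the input parameters, but it only reflects the sampling variability of the simulated M.O.S. values.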

Conclusions 


Probabilistic structural analysis and similarity 
analysis are two common analysis techniques 
used for assessing the reliability of redesigned 
hardware. Similarity analysis is used when there is 
a lack of data concerning the proposed redesigned 
component. Probabilistic structural analysis can 
be used when a structural model and the 
corresponding input variables are well defined. 
Monte Carlo Simulation is the most common form 
of probabilistic structural analysis and is the most 
accurate. However, sometimes Monte Carlo 
Simulation cannot be performed due to the 
complexity and time consuming nature of a 
particular design code. There are many other 
probabilistic methods that can be applied in these 
situations. Probabilistic structural analysis was 
applied to a variety of engineering problems and 
the resulting failure mode and uncertainty 
distributions were calculated. Overall, probabilistic 
structural analysis and similarity analysis can be
very useful tools for determining the reliability of
redesigned components.